-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Proposed "snappy" vs "x-snappy-framed" content-type confusion #12825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
❌ Your patch status has failed because the patch coverage (80.43%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #12825 +/- ##
==========================================
- Coverage 91.44% 91.42% -0.02%
==========================================
Files 487 487
Lines 26848 26892 +44
==========================================
+ Hits 24551 24586 +35
- Misses 1814 1820 +6
- Partials 483 486 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Please fix the lint issues and add a changelog entry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please link to what is the "authority" that defines the "correct" usage? I am not sure I understand who defines corectness here.
the "block" compression variant of "snappy". This change allows the collector to: | ||
- The server endpoints will now accept "x-snappy-framed" as a valid | ||
content-encoding. | ||
- Client compression type "snappy" will now compress to the "block" variant of snappy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't do this, since as @jpkrohling explained, this will break existing servers that are not updated.
First we need to add support for both on both server/client. The "hack" for the server can also go now. Then wait for few versions until you can change the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What specific behavior do we want here? I tried to avoid breaking receiving as right now that felt like the pain point. Should we add x-snappy-framed as a type first, and then at some point later switch the compression to use non-framed for "snappy"?
If I changed the PR to do this, would this satisfy the path forward and start the ball rolling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your PR avoids breakage in the situation where the receiver is updated but not the sender (and that's great!), but not in the opposite scenario: if the sender is updated but not the receiver, the sender will switch to block mode, and the receiver will not accept it. Of course, the sender can "just" update their config to explicitly switch back to framed mode, but this could still be a source of breakage.
Bogdan's suggestion is to add the protocol detection in the receiver (and the explicit framed mode) now, but to wait a few versions before switching "snappy" to use the block mode. This means that breakage will only occur if the sender gets both updates and the receiver gets none, which is less likely.
I think one additional possibility, to ease the transition and minimize how much code change needs to be postponed, would be to put the switch of snappy from framed to block mode behind an Alpha (off by default) feature gate, and add a warning log in confighttp when "snappy" is used with the gate disabled. Then later, we will only need to switch it to Beta (this is where the breaking change would be). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this suggestion. For the components that need to use snappy encoding/decoding correctly, we just need to document that the new feature gate is required for them to work.
I don't see anybody else requiring this besides the prometheus remote-write receiver, which is not even an alpha component yet, so it sounds reasonable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 with all the comments.
# Use this changelog template to create an entry for release notes. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: enhancement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely not enhancement
, this is a complete breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep it as enhancement
if we switch the strategy to using feature flags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would no longer be a breaking change if we delay the client changes (unless there are bugs in the protocol detection).
There isn't an authority, except that the common usage is "snappy means block format" and "x-snappy-framed" means the streaming friendly framed format. The only real authority I could find is that in the spec for the framed format, it specifically defines this:
(see https://github.com/google/snappy/blob/main/framing_format.txt) |
Hi @skandragon , thanks for working on this! I'm one of the codeowner of Prometheus Remote Write Receiver 👋 @perebaj and I were working on getting the receiver to alpha stage until we got blocked for several weeks due to prw requests failing to be decoded via httpconfig(open-telemetry/opentelemetry-collector-contrib#38899), until I randomly bumped into your PR today. Now that I've read your PR and the linked issue, things are making sense! One suggestion to make sure your PR works for the remote-write receiver is trying to build the collector-contrib, but updating the collector dependency to your commit SHA and verifying that it works. See open-telemetry/opentelemetry-collector-contrib#38899 (comment) for instructions. I'll try to find some time next week to help out here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👋 -- I've just tested the code changes in open-telemetry/opentelemetry-collector-contrib#39446, it works :)
I agree that it's not that simple to just change things like has been said already. How do you feel about the suggestion of using feature flags instead @skandragon? We could abandon the logic that detects the format by reading the first bytes of the sequence, which also helps with code readability in my opinion 🤔
# Use this changelog template to create an entry for release notes. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: enhancement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep it as enhancement
if we switch the strategy to using feature flags?
I think removing the protocol detection logic would cause issues once we set the feature flag to beta; you wouldn't be able to update the client before the server. |
@jade-guiton-dd @skandragon @ArthurSens does this apply to gRPC as well? |
It looks like configgrpc uses the "snappy" gRPC compressor from https://github.com/mostynb/go-grpc-compression, which defers to https://github.com/golang/snappy. Not sure which version of the protocol this is implementing though. It seems gRPC doesn't use the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe would be nice to update the README with the x-snappy-framed compression type
Howdy! Perhaps we should chat a bit. I want to use the remote write receiver as part of our managed collector framework (cardinalhq.io for reference) where we want to make it easy to migrate from a prometheus shop to an Otel data flow. We are adding scraping configs, but that seems less useful than just getting the firehose from prometheus directly. Let's chat and see if we can help develop the receiver. |
That would be great! We're currently developing it as part of a mentorship, @perebaj is doing most of the heavy work here :) |
You can also ping me on slack/git if any doubt arises. I hope that in the next weeks we will have the prw2 working... |
#### Description This is an alternative PR to #12825. I'm taking all the commits from that PR and adding the feature gate on the client side, as requested by reviews. The server behavior of peeking into the initial bytes to identify the encoding is kept :) If you've reviewed the previous PR, you can just review the latest commit! <!-- Issue number if applicable --> #### Link to tracking issue Fixes #10584 --------- Signed-off-by: Arthur Silva Sens <[email protected]> Co-authored-by: Michael Graff <[email protected]> Co-authored-by: Alex Boten <[email protected]> Co-authored-by: Pablo Baeyens <[email protected]> Co-authored-by: Jonathan <[email protected]>
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Since #12911 was merged, I think this can be closed. |
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [go.opentelemetry.io/collector/component](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` | | [go.opentelemetry.io/collector/component/componenttest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.125.0` -> `v0.126.0` | | [go.opentelemetry.io/collector/confmap](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` | | [go.opentelemetry.io/collector/consumer](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` | | [go.opentelemetry.io/collector/consumer/consumertest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.125.0` -> `v0.126.0` | | [go.opentelemetry.io/collector/pdata](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` | | [go.opentelemetry.io/collector/processor](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` | | [go.opentelemetry.io/collector/processor/processortest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.125.0` -> `v0.126.0` | --- ### Release Notes <details> <summary>open-telemetry/opentelemetry-collector (go.opentelemetry.io/collector/component)</summary> ### [`v1.32.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1320v01260) ##### 🛑 Breaking changes 🛑 - `configauth`: Removes deprecated `configauth.Authentication` and `extensionauthtest.NewErrorClient` ([#​12992](open-telemetry/opentelemetry-collector#12992)) The following have been removed: - `configauth.Authentication` use `configauth.Config` instead - `extensionauthtest.NewErrorClient` use `extensionauthtest.NewErr` instead ##### 💡 Enhancements 💡 - `service`: Replace `go.opentelemetry.io/collector/semconv` usage with `go.opentelemetry.io/otel/semconv` ([#​12991](open-telemetry/opentelemetry-collector#12991)) - `confmap`: Update the behavior of the confmap.enableMergeAppendOption feature gate to merge only component lists. ([#​12926](open-telemetry/opentelemetry-collector#12926)) - `service`: Add item count metrics defined in Pipeline Component Telemetry RFC ([#​12812](open-telemetry/opentelemetry-collector#12812)) See [Pipeline Component Telemetry RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md) for more details: - `otelcol.receiver.produced.items` - `otelcol.processor.consumed.items` - `otelcol.processor.produced.items` - `otelcol.connector.consumed.items` - `otelcol.connector.produced.items` - `otelcol.exporter.consumed.items` - `tls`: Add trusted platform module (TPM) support to TLS authentication. ([#​12801](open-telemetry/opentelemetry-collector#12801)) Now the TLS allows the use of TPM for loading private keys (e.g. in TSS2 format). ##### 🧰 Bug fixes 🧰 - `exporterhelper`: Add validation error for batch config if min_size is greater than queue_size. ([#​12948](open-telemetry/opentelemetry-collector#12948)) - `telemetry`: Allocate less memory per component when OTLP exporting of logs is disabled ([#​13014](open-telemetry/opentelemetry-collector#13014)) - `confmap`: Use reflect.DeepEqual to avoid panic when confmap.enableMergeAppendOption feature gate is enabled. ([#​12932](open-telemetry/opentelemetry-collector#12932)) - `internal telemetry`: Add resource attributes from telemetry.resource to the logger ([#​12582](open-telemetry/opentelemetry-collector#12582)) Resource attributes from telemetry.resource were not added to the internal console logs. Now, they are added to the logger as part of the "resource" field. - `confighttp and configcompression`: Fix handling of `snappy` content-encoding in a backwards-compatible way ([#​10584](open-telemetry/opentelemetry-collector#10584), [#​12825](open-telemetry/opentelemetry-collector#12825)) The collector used the Snappy compression type of "framed" to handle the HTTP content-encoding "snappy". However, this encoding is typically used to indicate the "block" compression variant of "snappy". This change allows the collector to: - When receiving a request with encoding 'snappy', the server endpoints will peek at the first bytes of the payload to determine if it is "framed" or "block" snappy, and will decompress accordingly. This is a backwards-compatible change. If the feature-gate "confighttp.framedSnappy" is enabled, you'll see new behavior for both client and server: - Client compression type "snappy" will now compress to the "block" variant of snappy instead of "framed". Client compression type "x-snappy-framed" will now compress to the "framed" variant of snappy. - Servers will accept both "snappy" and "x-snappy-framed" as valid content-encodings. - `tlsconfig`: Disable TPM tests on MacOS/Darwin ([#​12964](open-telemetry/opentelemetry-collector#12964)) <!-- previous-version --> </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://github.com/renovatebot/renovate/discussions) if that's undesired. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOS4yNjMuMSIsInVwZGF0ZWRJblZlciI6IjM5LjI2My4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6W119--> Reviewed-on: https://gitea.t000-n.de/t.behrendt/tracebasedlogsampler/pulls/13 Co-authored-by: Renovate Bot <[email protected]> Co-committed-by: Renovate Bot <[email protected]>
Description
The collector http compression and decompression logic uses "snappy" incorrectly. The content-type "snappy" should be the block ("non-framed") version, while "x-snappy-framed" should be the framed one. Unfortunately, the collector code used framed compression and decompression and called it "snappy."
This corrects this in two ways:
x-snappy-framed
as a content-type for both incoming and outgoing compression.In contrast to other proposals in the linked issue, this feels like a clean path forward as warnings about using "snappy" in the docs can be added, and people can either not change (in which case it will just start working using the block format) or change, in which case it will use framed.
Link to tracking issue
Fixes #10584 by providing a way forward
Testing